Convert ink_queue implementation to std::atomic#13170
Conversation
e507f16 to
caf1333
Compare
| SET_FREELIST_POINTER_VERSION(next, FROM_PTR(nullptr), FREELIST_VERSION(item) + 1); | ||
| result = ink_atomic_cas(&l->head.data, item.data, next.data); | ||
| } while (result == 0); | ||
| { |
There was a problem hiding this comment.
I removed this unnecessary block nesting.
| while (e) { | ||
| head_p *e_ = to_head_p(e, l->offset); | ||
| void *n = TO_PTR(FREELIST_POINTER(*e_)); | ||
| SET_FREELIST_POINTER_VERSION(*e_, n, FREELIST_VERSION(*e_)); | ||
| e = n; |
There was a problem hiding this comment.
If the offset is applied before checking for nullptr, it is possible to pass the tscore tests but fail a bunch of other unit tests (event system and cache tests).
caf1333 to
a106ab4
Compare
There was a problem hiding this comment.
Pull request overview
This PR refactors the ink_queue freelist / atomic list implementation to use std::atomic-based state (including a revised head_p representation) and adds unit tests/benchmarks to validate and measure the behavior. The goal is to eliminate UB from type punning and improve correctness around alignment and concurrency.
Changes:
- Refactor
head_pto an integral type and introducememcpy-based view/load/store helpers for pointer+version packing. - Update freelist/atomiclist operations to use
std::atomic(and add mutex-based mutual exclusion for pop paths). - Add Catch2 unit tests/benchmarks for freelist and atomic list behavior, and update build configuration accordingly.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
include/tscore/ink_queue.h |
Refactors head_p, adds atomic/mutex members to list types, and updates pointer/version access macros. |
src/tscore/ink_queue.cc |
Migrates freelist/atomiclist logic to std::atomic + new packing helpers; adds alignment checks and mutexes. |
src/tscore/unit_tests/test_ink_queue.cc |
New Catch2 unit tests and benchmarks for freelist/atomic list behavior. |
src/tscore/CMakeLists.txt |
Adds the new unit test and links atomic. |
src/proxy/logging/LogObject.cc |
Adapts CAS usage / version typing to the new head_p API. |
* Fix macro definitions for all platforms The definitions still used the `data` member of the old `head_p` struct. * Clean up `head_p_view` definitions * Remove unused `<iostream>` include * Make `freelist_init` alignment check consistent * Test atomiclist with offset * Add @file description to test_ink_queue.cc * Add missing `<vector>` include to unit tests * Construct `InkFreeList` with placement new
* Only link libatomic when using 128bit atomics and lib exists
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
cmake/Check128BitCas.cmake:45
CHECK_PROGRAMas written will not compile as C++:std::atomic<__int128>::compare_exchange_strongtakesexpectedby reference anddesiredas the second parameter, but here both arguments are rvalues and reversed. Also this file includesCheckCSourceCompilesbut callscheck_cxx_source_compiles, and the fallback usescheck_c_source_compileseven though the program is C++ (<atomic>). Please switch toinclude(CheckCXXSourceCompiles)and use a valid C++ compare-exchange snippet (with a mutableexpected) for both checks (including the-mcx16probe).
set(CHECK_PROGRAM
"
#include <atomic>
int main()
{
std::atomic<__int128> x{0};
return x.compare_exchange_strong(10, 0);
}
"
)
include(CheckCSourceCompiles)
check_cxx_source_compiles("${CHECK_PROGRAM}" TS_HAS_128BIT_CAS)
if(NOT TS_HAS_128BIT_CAS)
unset(TS_HAS_128BIT_CAS CACHE)
set(CMAKE_REQUIRED_FLAGS "-Werror -mcx16")
check_c_source_compiles("${CHECK_PROGRAM}" TS_HAS_128BIT_CAS)
set(NEED_MCX16 ${TS_HAS_128BIT_CAS})
* Store freelist nodes as `void*`, not `head_p` * Remove redundant freelist allocation in tests
* Fix 128bit CAS test program
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.
Comments suppressed due to low confidence (2)
include/tscore/ink_queue.h:250
- INK_ATOMICLIST_EMPTY still passes the std::atomic<head_p> member directly into FREELIST_POINTER/TO_PTR. Since std::atomic is not implicitly convertible to head_p, this will not compile (and call sites like ProtectedQueue rely on this macro). Load the atomic (e.g., use .head.load(...)) before applying FREELIST_POINTER/TO_PTR, and update both NT/non-NT macro branches accordingly.
#if !defined(INK_QUEUE_NT)
#define INK_ATOMICLIST_EMPTY(_x) (!(TO_PTR(FREELIST_POINTER((_x.head)))))
#else
/* ink_queue_nt.c doesn't do the FROM/TO pointer swizzling */
#define INK_ATOMICLIST_EMPTY(_x) (!((FREELIST_POINTER((_x.head)))))
#endif
cmake/Check128BitCas.cmake:46
- Check128BitCas.cmake calls check_cxx_source_compiles but only includes CheckCSourceCompiles, which does not define that macro in standard CMake. Also, the fallback path still uses check_c_source_compiles even though CHECK_PROGRAM is now C++ (includes , std::atomic), so the -mcx16 probe will always fail. Include CheckCXXSourceCompiles and use check_cxx_source_compiles for both probes.
include(CheckCSourceCompiles)
check_cxx_source_compiles("${CHECK_PROGRAM}" TS_HAS_128BIT_CAS)
if(NOT TS_HAS_128BIT_CAS)
unset(TS_HAS_128BIT_CAS CACHE)
set(CMAKE_REQUIRED_FLAGS "-Werror -mcx16")
check_c_source_compiles("${CHECK_PROGRAM}" TS_HAS_128BIT_CAS)
set(NEED_MCX16 ${TS_HAS_128BIT_CAS})
| # on 64-bit architectures, freelist pointers are stored in a special bitwise | ||
| # format, so LSan cannot find link pointers. It will erroneously determine | ||
| # that the list tail is not reachable. | ||
| leak:freelist_new |
| struct InkAtomicList { | ||
| InkAtomicList() {} | ||
| head_p head{}; | ||
| const char *name = nullptr; | ||
| uint32_t offset = 0; | ||
| std::mutex m; | ||
| std::atomic<head_p> head{}; | ||
| const char *name = nullptr; | ||
| uint32_t offset = 0; | ||
| }; |
| TEST_CASE("Freelist benchmarks", "[freelist][bench]") | ||
| { | ||
| BENCHMARK_ADVANCED("Single threaded alloc")(Catch::Benchmark::Chronometer meter) | ||
| { | ||
| InkFreeList *f{ink_freelist_create("test#1", sizeof(std::int32_t), 4, alignof(std::int32_t))}; | ||
|
|
||
| ink_freelist_new(f); | ||
|
|
||
| meter.measure([&]() { return ink_freelist_new(f); }); | ||
| }; | ||
|
|
||
| BENCHMARK_ADVANCED("Single threaded free")(Catch::Benchmark::Chronometer meter) | ||
| { | ||
| InkFreeList *f{ink_freelist_create("test#1", sizeof(std::int32_t), 4, alignof(std::int32_t))}; | ||
|
|
||
| std::vector<void *> ptrs; | ||
| ptrs.resize(meter.runs()); | ||
| for (auto &x : ptrs) { | ||
| x = ink_freelist_new(f); | ||
| } | ||
|
|
||
| meter.measure([&](int i) { return ink_freelist_free(f, ptrs[i]); }); | ||
| }; | ||
| } |
* Change remaining `alignof(head_p)` to `alignof(void*)`
The PR title is fairly self-explanatory, but the design choices here deserve explicit mention.
TS_HAS_128BIT_CASnow depends on__atomicinstead of__syncIn our build pipeline, OSX and FreeBSD support this.
This has been done to eliminate type punning, which was done all over the implementation, and is UB. The pointer and version are now always set through the macros
FREELIST_POINTER,FREELIST_VERSION, andSET_FREELIST_POINTER_VERSION, which use a separate {pointer, version} struct type andmemcpyon platforms where this is appropriate (see preprocessor defs for the list).This is necessary so that the head of the list will be an atomic integer type instead of an atomic class type to be sure it can use 128bit atomic hardware instructions on platforms that support them.
void*alignment requirementsThis is a minor bug in master.
This one is annoying and might need to be reverted from this PR and considered separately. This particular change was made to fully fix #11640 - there is a minor data race without it in that the second pointer from the list head can be overwritten by an allocator's placement new before it is read without synchronization in
freelist_new(a similar argument applies toatomiclist_pop;atomiclist_popallis unaffected) by another thread, which is going to subsequently find out the list head is stale and retry. Thus, the garbage pointer is not dereferenced, but this is still UB.I have been thinking about approaches here. One approach is to add an atomic flag to the list head that is set by any thread popping from the list. A thread that has successfully popped can spin on that flag to wait for the completion of any other threads still reading the memory it is about to return. My intuition is that this will be better, but I don't know without benchmarking, and it's a lot more complex than the lock.
Fixes #7398
Fixes #11640 in release mode only (dummy_forced_read calls still race)
Previous Work
See #7382. This PR is only a step in the direction of #7382; it retains a lot of the old code structure along with most of its design flaws. If this change is accepted, it should thereafter be possible to apply other design improvements from #7382, such as the fleshed out versioned pointer type, with greater confidence.
A Few Comments About Assertions
This PR adds a hoard of assertions that check alignment requirements. Most of them are debug assertions - the alignment check on the pointer passed to
freelist_pushis a release assert for now, because it would almost certainly indicate a major issue if it triggered. According to the comment from @bryancall, this assertion was in fact failing before (it was previously a debug assert, and he commented it out). I am hopeful that that issue is now resolved.Performance Implications
TODO